Extended abstract - Joint Statistical Meetings (JSM), Montreal, August 2013: Overview and taxonomy of techniques for privacy-preserving record linkage

Author

  • Peter Christen
Abstract

Record linkage is the process of identifying which records in two or more databases correspond to the same real-world entity. Three major challenges of this process are (1) achieving high linkage quality, (2) scaling to very large databases, and (3) protecting the privacy and confidentiality of the personal identifying data used in the linkage process. This presentation provides an overview of the various techniques that have been developed to facilitate the linking of data across organisations in such a way that no private or confidential information is revealed. We then characterise such privacy-preserving record linkage techniques along fifteen dimensions. This provides us with a taxonomy that allows us to highlight shortcomings of current techniques and discuss future research directions.

Similar resources

Overview and taxonomy of techniques for privacy-preserving record linkage

Motivation: Large amounts of data are being collected by organisations in the private and public sectors, as well as by individuals. Much of these data are about people, or they are generated by people.

Cryptanalysis of Basic Bloom Filters Used for Privacy Preserving Record Linkage

Linking databases containing information on individual characteristics and behavior is of increasing scientific and commercial interest. In many applications, linking databases has to be done without a unique personal number. Hence, due to privacy concerns, privacy preserving record linkage (PPRL) is used most often. In this context encrypted personal quasi-identifiers such as first names, surn...

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

Privacy-Preserving Record Linkage

Record linkage has a long tradition in both the statistical and the computer science literature. We survey current approaches to the record linkage problem in a privacy-aware setting and contrast these with the more traditional literature. We also identify several important open questions that pertain to private record linkage from different per-

Meeting Report: STAT-HAWKERS at the JSM-2013, Montreal, Canada

In this foreword, we attempt to recall some memories annotated with appropriate photographs from the booth STAT-HAWKERS at the Joint Statistical Meeting (JSM)-2013 held in Montreal, Canada during August 3–8, 2013. There were three main advertising posters: one for the book, “Thinking Statistically: Elephants Go to School”; second for the monograph, “Advanced Sampling Theory with Applications: H...

Publication date: 2013